An Oversampling Method for Class Imbalance Problems on Large Datasets
Abstract
Several oversampling methods have been proposed for solving the class imbalance problem. However, most of them require searching for the k-nearest neighbors to generate synthetic objects. This requirement makes them time-consuming and therefore unsuitable for large datasets. In this paper, an oversampling method for class imbalance problems that does not require a k-nearest-neighbor search is proposed. According to our experiments on datasets with different sizes and degrees of imbalance, the proposed method is at least twice as fast as the 8 fastest methods reported in the literature while obtaining similar oversampling quality.
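The abstract does not describe the proposed algorithm, so the following is only a minimal sketch of what a neighbor-free oversampler can look like. It is not the paper's method. It resamples minority rows at random and adds small Gaussian jitter, which costs time linear in the number of synthetic objects and involves no k-nearest-neighbor search. The function name `oversample_without_knn` and the `noise_scale` parameter are hypothetical, chosen for illustration.

```python
import numpy as np


def oversample_without_knn(X, y, minority_label, rng=None, noise_scale=0.05):
    """Balance a binary dataset without any nearest-neighbor search.

    Illustrative stand-in (random resampling plus jitter), NOT the
    method proposed in the paper. `noise_scale` is a hypothetical knob.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    minority = X[y == minority_label]
    # Number of synthetic rows needed to match the majority class size.
    n_needed = int((y != minority_label).sum() - len(minority))
    if n_needed <= 0 or len(minority) == 0:
        return X, y

    # Pick seed rows uniformly at random: O(n_needed), no distance computations.
    seeds = minority[rng.integers(0, len(minority), size=n_needed)]

    # Jitter each feature in proportion to its spread within the minority class.
    scale = noise_scale * minority.std(axis=0)
    synthetic = seeds + rng.normal(0.0, 1.0, size=seeds.shape) * scale

    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_needed, minority_label, dtype=y.dtype)])
    return X_out, y_out
```

By contrast, neighbor-based oversamplers such as SMOTE must find the k nearest minority neighbors of each seed before interpolating, and that search is the step that becomes expensive on large datasets.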
Similar resources
C4.5 Consolidation Process: An Alternative to Intelligent Oversampling Methods in Class Imbalance Problems
In real world problems solved using data mining techniques, it is very usual to find data in which the number of examples of one of the classes is much smaller than the number of examples of the rest of the classes. Many works have been done to deal with these problems known as class imbalance problems. Most of them focus their effort on data resampling techniques so that training data would be...
Generative Oversampling for Mining Imbalanced Datasets
One way to handle data mining problems where class prior probabilities and/or misclassification costs between classes are highly unequal is to resample the data until a new, desired class distribution in the training data is achieved. Many resampling techniques have been proposed in the past, and the relationship between resampling and cost-sensitive learning has been well studied. Surprisingly...
An Efficient Numerical Method for a Class of Boundary Value Problems, Based on Shifted Jacobi-Gauss Collocation Scheme
We present a numerical method for a class of boundary value problems on the unit interval which feature a type of exponential and product nonlinearities. Also, we consider singular case. We construct a kind of spectral collocation method based on shifted Jacobi polynomials to implement this method. A number of specific numerical examples demonstrate the accuracy and the efficiency of the propos...
Addressing the Class Imbalance Problem in Medical Datasets
A well balanced dataset is very important for creating a good prediction model. Medical datasets are often not balanced in their class labels. Most existing classification methods tend to perform poorly on minority class examples when the dataset is extremely imbalanced. This is because they aim to optimize the overall accuracy without considering the relative distribution of each class. In thi...
Journal
Journal title: Applied Sciences
Year: 2022
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app12073424